An Autoencoder-Based Image Descriptor for Image Matching and Retrieval
Local image features are used in many computer vision applications. Many point detectors and descriptors have been proposed in recent years; however, the creation of effective descriptors is still an open research topic. The Scale Invariant Feature Transform (SIFT), developed by David Lowe, is widely used in image matching and image retrieval. SIFT detects interest points in an image based on scale-space analysis, which is invariant to changes in image scale. A SIFT descriptor contains gradient information about an image patch centered at a point of interest. SIFT provides a high matching rate and is robust to image transformations; however, it is slow in image matching and retrieval. An autoencoder is a method for representation learning and is used in this project to construct a low-dimensional representation of high-dimensional data while preserving the structure and geometry of the data. In many computer vision tasks, the high dimensionality of input data means a high computational cost. The main motivation of this project is to improve the speed and distinctiveness of SIFT descriptors. To achieve this, a new descriptor based on an autoencoder is proposed. The newly generated descriptors reduce the size and complexity of SIFT descriptors, reducing the time required in image matching and image retrieval.
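The dimensionality-reduction idea can be sketched with a small linear autoencoder that compresses 128-d SIFT-like descriptors to 32 dimensions. The layer sizes, learning rate, training loop, and random stand-in descriptors below are illustrative assumptions, not the project's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 128))                # stand-in for SIFT descriptors
X /= np.linalg.norm(X, axis=1, keepdims=True)  # SIFT vectors are unit-normalised

d_in, d_code = 128, 32
W_enc = rng.normal(scale=0.1, size=(d_in, d_code))
W_dec = rng.normal(scale=0.1, size=(d_code, d_in))
lr = 0.01

for _ in range(200):
    code = X @ W_enc          # encode: 128 -> 32
    X_hat = code @ W_dec      # decode: 32 -> 128
    err = X_hat - X           # reconstruction error
    # gradients of the mean squared reconstruction loss
    g_dec = code.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

compact = X @ W_enc  # 32-d descriptors: 4x smaller, so matching is faster
print(compact.shape)  # (500, 32)
```

Matching then runs on the 32-d codes instead of the full 128-d descriptors, which is where the speed-up comes from.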
Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation
In reinforcement learning, domain randomisation is an increasingly popular
technique for learning more general policies that are robust to domain-shifts
at deployment. However, naively aggregating information from randomised domains
may lead to high variance in gradient estimation and an unstable learning process.
To address this issue, we present a peer-to-peer online distillation strategy
for RL termed P2PDRL, where multiple workers are each assigned to a different
environment, and exchange knowledge through mutual regularisation based on
Kullback-Leibler divergence. Our experiments on continuous control tasks show
that P2PDRL enables robust learning across a wider randomisation distribution
than baselines, and more robust generalisation to new environments at test time.
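The mutual-regularisation idea can be sketched numerically: each worker's policy is pulled toward its peers' policies via a KL-divergence penalty. The categorical action distributions and worker count below are invented for illustration and are not the paper's actual continuous-control setup:

```python
import numpy as np

def kl(p, q, eps=1e-8):
    """KL(p || q) for batched categorical distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

rng = np.random.default_rng(1)
n_workers, n_states, n_actions = 4, 10, 3
logits = rng.normal(size=(n_workers, n_states, n_actions))
policies = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)

def peer_distill_loss(i, policies):
    # distillation penalty for worker i: mean KL to every peer's policy
    peers = [j for j in range(len(policies)) if j != i]
    return np.mean([kl(policies[i], policies[j]).mean() for j in peers])

losses = [peer_distill_loss(i, policies) for i in range(n_workers)]
print([round(l, 3) for l in losses])
```

Each worker would add its distillation loss (suitably weighted) to its own RL objective, so no single randomised domain can pull a policy too far from the consensus.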
ODAM: Gradient-based instance-specific visual explanations for object detection
We propose the gradient-weighted Object Detector Activation Maps (ODAM), a
visualized explanation technique for interpreting the predictions of object
detectors. Utilizing the gradients of detector targets flowing into the
intermediate feature maps, ODAM produces heat maps that show the influence of
regions on the detector's decision for each predicted attribute. Compared to
previous work on class activation maps (CAM), ODAM generates
instance-specific explanations rather than class-specific ones. We show that
ODAM is applicable to both one-stage detectors and two-stage detectors with
different types of detector backbones and heads, and produces higher-quality
visual explanations than the state of the art, both effectively and efficiently.
We next propose a training scheme, Odam-Train, to improve the detector's
ability to discriminate objects in its explanations by encouraging
consistency between explanations for detections of the same object and
distinct explanations for detections of different objects. Based on the heat
maps produced by ODAM with Odam-Train, we propose Odam-NMS, which uses the
model's explanation for each prediction to distinguish duplicate detections
of the same object. We present a detailed analysis of the visualized
explanations of detectors and carry out extensive experiments to validate the
effectiveness of the proposed ODAM.
Comment: 2023 International Conference on Learning Representations
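The gradient-weighting step can be sketched as follows. The feature map, gradients, and element-wise weighting here are synthetic stand-ins in the spirit of ODAM/Grad-CAM-style methods, not the paper's exact formulation, which depends on detector-specific targets:

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
A = rng.normal(size=(C, H, W))      # intermediate feature maps
grad = rng.normal(size=(C, H, W))   # d(score)/dA for one predicted box

# element-wise gradient weighting keeps the map instance-specific,
# then sum over channels and keep only positive evidence
heatmap = np.maximum((grad * A).sum(axis=0), 0.0)
heatmap /= heatmap.max() + 1e-8     # normalise to [0, 1] for display
print(heatmap.shape)  # (16, 16)
```

In a real detector, `grad` would come from backpropagating one detection's score (or box attribute) to the chosen feature layer; the resulting heat map highlights the regions that supported that specific prediction.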
Deep Reinforcement Learning for Resource Management in Network Slicing
Network slicing has emerged as a promising business for operators, allowing
them to sell customized slices to various tenants at different prices. In
order to provide better-performing and cost-efficient services, network slicing
involves challenging technical issues and urgently calls for intelligent
innovations that make resource management consistent with users' activities
per slice. In that regard, deep reinforcement learning (DRL), which focuses on
how to interact with the environment by trying alternative actions and
reinforcing the tendency toward actions that produce more rewarding
consequences, is regarded as a promising solution. In this paper, after briefly reviewing the
fundamental concepts of DRL, we investigate the application of DRL in solving
some typical resource-management problems in network slicing scenarios, which include
radio resource slicing and priority-based core network slicing, and demonstrate
the advantage of DRL over several competing schemes through extensive
simulations. Finally, we also discuss the possible challenges of applying DRL in
network slicing from a general perspective.
Comment: The manuscript has been accepted by IEEE Access in Nov. 201
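A toy tabular version of the resource-management loop can illustrate the trial-and-reinforce idea. States, actions, the reward function, and all hyperparameters below are invented for illustration; the paper uses deep RL on far richer state spaces:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 4        # demand levels x bandwidth splits
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

def step(state, action):
    # hypothetical environment: penalty grows as the bandwidth split
    # drifts away from the slice's current demand level
    reward = -abs(state * (n_actions - 1) / (n_states - 1) - action)
    next_state = rng.integers(n_states)   # demand fluctuates randomly
    return next_state, reward

state = rng.integers(n_states)
for _ in range(5000):
    # epsilon-greedy: mostly exploit, occasionally try alternative actions
    action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
    next_state, reward = step(state, action)
    # Q-learning update: reinforce actions with more rewarding consequences
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print(Q.argmax(axis=1))  # learned bandwidth split per demand level
```

The DRL methods in the paper replace the table `Q` with a neural network so the same update rule scales to large, continuous slice-state spaces.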
Understanding the Performance of Learning Precoding Policy with GNN and CNNs
Learning-based precoding has been shown to be implementable in real time,
jointly optimized with channel acquisition, and robust to imperfect channels.
Yet previous works rarely explain the design choices and learning performance,
and existing methods either suffer from high training complexity or depend on
problem-specific models. In this paper, we address these issues by analyzing
the properties of precoding policy and inductive biases of neural networks,
noticing that the learning performance can be decomposed into approximation and
estimation errors where the former is related to the smoothness of the policy
and both depend on the inductive biases of neural networks. To this end, we
introduce a graph neural network (GNN) to learn precoding policy and analyze
its connection with the commonly used convolutional neural networks (CNNs). By
taking a sum rate maximization precoding policy as an example, we explain why
the learned precoding policy performs well in the low signal-to-noise ratio
regime, in spatially uncorrelated channels, and when the number of users is
much smaller than the number of antennas, as well as why the GNN has higher
learning efficiency than CNNs. Extensive simulations validate our analyses and
evaluate the generalization ability of the GNN.
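The inductive bias at the heart of this comparison can be sketched directly: a GNN-style layer that shares weights across users is permutation-equivariant, so reordering the users reorders the output identically, a property a plain CNN does not have. The layer sizes and random weights below are arbitrary assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 4, 6                         # users x feature dimension
H = rng.normal(size=(K, d))         # per-user channel features
W_self = rng.normal(size=(d, d))
W_agg = rng.normal(size=(d, d))

def gnn_layer(H):
    # each user combines its own features with the sum of all others',
    # using the same weights for every user
    agg = H.sum(axis=0, keepdims=True) - H
    return np.maximum(H @ W_self + agg @ W_agg, 0.0)

perm = rng.permutation(K)
out = gnn_layer(H)
# permutation equivariance: permuting inputs permutes outputs identically
assert np.allclose(gnn_layer(H[perm]), out[perm])
print(out.shape)  # (4, 6)
```

Because the precoding policy itself is permutation-equivariant in the users, matching this bias is what gives the GNN its learning-efficiency advantage over CNNs.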
Demand Forecast in Retail Assortment Optimization—Based on an Empirical Analysis of Beverage Sales
This paper focuses on establishing a demand forecasting model to optimize product assortments from a set of SKUs in the same category. The aim of the model is revenue maximization. At the attribute level, the demand model considers consumers' preferences and the possibility of substitution between different attributes. It then decomposes each product into its specific attributes and multiplies the effects of these attributes. Furthermore, a beverage case was applied to the demand model for empirical analysis. Top beverage categories were selected, and e-commerce sales data were collected to represent the pre-sale of the whole categories. Moreover, a store named S with some beverage SKUs is assumed and applied to the model, which predicted the sales volume of each existing SKU and the total revenue.
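The multiplicative attribute-effect structure can be sketched as follows. The attribute names, effect values, and prices are invented numbers for illustration, not fitted values from the paper's beverage data:

```python
base_demand = 1000.0  # hypothetical category-level baseline (units per period)
effects = {
    "flavour": {"cola": 1.3, "lemon": 0.9, "tea": 1.1},
    "size":    {"330ml": 1.0, "500ml": 1.2},
}

def forecast(sku):
    """Predicted demand = base demand x the effect of each attribute level."""
    demand = base_demand
    for attr, level in sku.items():
        demand *= effects[attr][level]
    return demand

def revenue(assortment, prices):
    """Total revenue of an assortment under the forecast demand."""
    return sum(forecast(sku) * prices[i] for i, sku in enumerate(assortment))

assortment = [{"flavour": "cola", "size": "500ml"},
              {"flavour": "tea", "size": "330ml"}]
prices = [3.0, 2.5]
print(round(forecast(assortment[0]), 1))  # 1000 * 1.3 * 1.2 = 1560.0
print(round(revenue(assortment, prices), 1))
```

Assortment optimization then amounts to searching over candidate SKU sets for the one with the highest predicted revenue.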
Generalisation in deep reinforcement learning with multiple tasks and domains
A long-standing vision of robotics research is to build autonomous systems that can
adapt to unforeseen environmental perturbations and learn a set of tasks progressively.
Reinforcement learning (RL) has shown great success in a variety of robot control
tasks because of recent advances in hardware and learning techniques. To further fulfil this long-term goal, generalisation in RL arises as a demanding research topic, as
it allows learning agents to extract knowledge from past experience and transfer to
new situations. This covers generalisation against sampling noise to avoid overfitting,
generalisation against environmental changes to avoid domain shift, and generalisation
over different but related tasks to achieve lifelong knowledge transfer. This thesis investigates these challenges in the context of RL, with a main focus on cross-domain
and cross-task generalisation.
We first address the problem of generalisation across domains. With a focus on
continuous control tasks, we characterise the sources of uncertainty that may cause
generalisation challenges in Deep RL, and provide a new benchmark and thorough
empirical evaluation of generalisation challenges for state-of-the-art Deep RL methods.
In particular, we show that, if generalisation is the goal, then the common practice of
evaluating algorithms based on their training performance leads to the wrong conclusions about algorithm choice. Moreover, we evaluate several techniques for improving
generalisation and draw conclusions about the most robust techniques to date.
From the evaluation, we can see that learning from multiple domains improves
generalisation performance across domains. However, aggregating gradient information from different domains may make learning unstable. In the second work, we propose to update the policy to minimise the sum of distances to the new policies learned
in each domain in every iteration, measured by Kullback-Leibler (KL) divergence of
output (action) distributions. We show that our method improves both the training
asymptotic reward and testing policy robustness against domain shifts in a variety of
control tasks.
We finally investigate generalisation across different classes of control tasks. In
particular, we introduce a class of neural network controllers that can realise four distinct tasks: reaching, object throwing, casting, and ball-in-cup. By factorising the
weights of the neural network, transferable latent skills are extracted which enable acceleration of learning in cross-task transfer. With a suitable curriculum, this allows
us to learn challenging dexterous control tasks like ball-in-cup from scratch with only
reinforcement learning.
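The weight-factorisation idea can be sketched in a few lines: controllers for several related tasks share a small basis of latent-skill vectors, and each task keeps only a mixing vector over that basis. The dimensions and the least-squares fit below are illustrative assumptions, not the thesis's actual training procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, k = 4, 64, 3                  # tasks, flattened weight dim, latent skills
S = rng.normal(size=(k, d))         # shared latent-skill basis
M = rng.normal(size=(T, k))         # per-task mixing coefficients
W = M @ S                           # each row: one task's controller weights

# a new task's weights can be fit as a combination of existing skills,
# which is what accelerates cross-task transfer
w_new = np.array([0.5, -1.0, 2.0]) @ S
m_hat, *_ = np.linalg.lstsq(S.T, w_new, rcond=None)
print(W.shape)  # (4, 64)
```

Learning a new task then means optimising only the k mixing coefficients instead of all d controller weights, which is far fewer parameters.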
- …